6 research outputs found

Policy learning in Continuous-Time Markov Decision Processes using Gaussian Processes

Author: Bartocci Ezio
Bortolussi Luca
Brázdil Tomás
Milios Dimitrios
Sanguinetti Guido
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

Continuous-time Markov decision processes provide a very powerful mathematical framework to solve policy-making problems in a wide range of applications, ranging from the control of populations to cyber–physical systems. The key problem to solve for these models is to efficiently compute an optimal policy to control the system in order to maximise the probability of satisfying a set of temporal logic specifications. Here we introduce a novel method based on statistical model checking and an unbiased estimation of a functional gradient in the space of possible policies. Our approach presents several advantages over the classical methods based on discretisation techniques, as it does not assume the a-priori knowledge of a model that can be replaced by a black-box, and does not suffer from state-space explosion. The use of a stochastic moment-based gradient ascent algorithm to guide our search considerably improves the efficiency of learning policies and accelerates the convergence using the momentum term. We demonstrate the strong performance of our approach on two examples of non-linear population models: an epidemiology model with no permanent recovery and a queuing system with non-deterministic choice.

Archivio istituzionale della ricerca - Università di Trieste

Edinburgh Research Explorer

Sissa Digital Library

Approximating values of generalized-reachability stochastic games

Author: Baier Christel
Basset Nicolas
Brenguier Romain
Brázdil Tomás
Chatterjee Krishnendu
Chen Taolue
Condon Anne
Forejt Vojtech
Kwiatkowska Marta
Kwiatkowska Marta Z.
Randour Mickael
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2020

Simple stochastic games are turn-based 2½-player games with a reachability objective. The basic question asks whether one player can ensure reaching a given target with at least a given probability. A natural extension is games with a conjunction of such conditions as objective. Despite a plethora of recent results on the analysis of systems with multiple objectives, the decidability of this basic problem remains open. In this paper, we present an algorithm approximating the Pareto frontier of the achievable values to a given precision. Moreover, it is an anytime algorithm, meaning it can be stopped at any time returning the current approximation and its error bound.

arXiv.org e-Print Archive

Crossref

IST Austria: PubRep (Institute of Science and Technology)

Publikationsserver der RWTH Aachen University

Efficient Analysis of Probabilistic Programs with an Unbounded Counter

Author: Antonín Kučera
Brázdil T.
Chatterjee K.
Křetínský J.
Stefan Kiefer
Tomás Brázdil
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date

Crossref

Stochastic Shortest Paths and Weight-Bounded Properties in Markov Decision Processes

Author: Abdulla Parosh Aziz
Baier Christel
Brázdil Tomás
Chatterjee Krishnendu
de Alfaro Luca
Kallenberg Lodewijk
Krähmann Daniel
Puterman Martin L.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 30/04/2018

The paper deals with finite-state Markov decision processes (MDPs) with integer weights assigned to each state-action pair. New algorithms are presented to classify end components according to their limiting behavior with respect to the accumulated weights. These algorithms are used to provide solutions for two types of fundamental problems for integer-weighted MDPs. First, a polynomial-time algorithm for the classical stochastic shortest path problem is presented, generalizing known results for special classes of weighted MDPs. Second, qualitative probability constraints for weight-bounded (repeated) reachability conditions are addressed. Among others, it is shown that the problem to decide whether a disjunction of weight-bounded reachability conditions holds almost surely under some scheduler belongs to NP ∩ coNP, is solvable in pseudo-polynomial time and is at least as hard as solving two-player mean-payoff games, while the corresponding problem for universal quantification over schedulers is solvable in polynomial time.

HAL-CentraleSupelec

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

LNCS

Author: Benjamin Aminof
BL Kaminski
C Baier
C Rackoff
E Mayr
HC Yen
J Esparza
J Hoffmann
Javier Esparza
Jérôme Leroux
K Chatterjee
K Etessami
Krishnendu Chatterjee
Laura Bozzelli
Marcin Jurdziński
MF Atig
Moritz Sinn
N Foster
R Bloem
R Wilhelm
RM Karp
S Schmitz
S Thrun
Sumit Gulwani
T Brázdil
Tomás Brázdil
Van Chan Ngo
Y Velner
Z Ghahramani
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

A probabilistic vector addition system with states (pVASS) is a finite state Markov process augmented with non-negative integer counters that can be incremented or decremented during each state transition, blocking any behaviour that would cause a counter to decrease below zero. The pVASS can be used as abstractions of probabilistic programs with many decidable properties. The use of pVASS as abstractions requires the presence of nondeterminism in the model. In this paper, we develop techniques for checking fast termination of pVASS with nondeterminism. That is, for every initial configuration of size n, we consider the worst expected number of transitions needed to reach a configuration with some counter negative (the expected termination time). We show that the problem whether the asymptotic expected termination time is linear is decidable in polynomial time for a certain natural class of pVASS with nondeterminism. Furthermore, we show the following dichotomy: if the asymptotic expected termination time is not linear, then it is at least quadratic, i.e., in Ω(n²).

Crossref

IST Austria: PubRep (Institute of Science and Technology)

Placing unprecedented recent fir growth in a European-wide and Holocene-long context

Author: Brázdil Rudolf
Büntgen Ulf
Bürgi Matthias
Carrer Marco
Esper Jan
Hagedorn Frank
Helle Gerhard
Heussner Karl-Uwe
Hofmann Jutta
Julio Camarero J.
Kaplan Jed O.
Kontic Raymond
Kyncl Josef
Kyncl Tomás
Liebhold Andrew
Schaub Michael
Tegel Willy
Tinner Willy
Publication venue: 'Wiley'
Publication date: 01/01/2014

Forest decline played a pivotal role in motivating Europe's political focus on sustainability around 35 years ago. Silver fir (Abies alba) exhibited a particularly severe dieback in the mid-1970s, but disentangling biotic from abiotic drivers remained challenging because both spatial and temporal data were lacking. Here, we analyze 14 136 samples from living trees and historical timbers, together with 356 pollen records, to evaluate recent fir growth from a continent-wide and Holocene-long perspective. Land use and climate change influenced forest growth over the past millennium, whereas anthropogenic emissions of acidic sulfates and nitrates became important after about 1850. Pollution control since the 1980s, together with a warmer but not drier climate, has facilitated an unprecedented surge in productivity across Central European fir stands. Restricted fir distribution prior to the Mesolithic and again in the Modern Era, separated by a peak in abundance during the Bronze Age, is indicative of the long-term interplay of changing temperatures, shifts in the hydrological cycle, and human impacts that have shaped forest structure and productivity.

Infoscience - École polytechnique fédérale de Lausanne

OPUS Augsburg

Bern Open Repository and Information System (BORIS)